High Performance Computing with R @ UC Santa Barbara

Mary Donovan and Sharon Solis
April 9, 2019

Outline

  • What is high performance computing?
  • Overview of UCSB resources
  • Accessing pod and knot
  • Running jobs
  • Some specifics for R
  • An example
  • Give it a try
  • Demo

What is high performance computing?


  • Multiple computer nodes, with fast interconnect, where each node consists of many CPU cores (aka “cluster”)
  • Allows multiple users to run computations simultaneously
  • Allows single users to access multiple cores and multiple nodes for parallel jobs
  • Can have high end GPU nodes (specialized processors)
  • Can have large memory nodes (approx 1TB RAM)

What is high performance computing?

Why and when to use HPC?

  • Designed for computational problems that are too large, take too long, and/or require more file storage than standard computers provide

  • When HPC might not be your solution:

    • Your workflow is highly interactive, with single runs (you need a really powerful desktop)
    • You need 1,000 nodes, but only once every 3 months (Cloud resources may be your solution)
    • You need 1,000 nodes, all the time (You need your own cluster)
    • You work with sensitive data

What is high performance computing?

  • Serial vs. parallel computing
  • Serial:
    • one task runs on one core of one node at a time
  • Parallel:
    • many independent tasks run at once
    • or coupled tasks run at the same time, exchanging data where their 'boundary conditions' match up
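The serial/parallel distinction can be seen even in plain shell: `&` launches a task in the background and `wait` blocks until every background task finishes. This is only a minimal sketch of the idea; on a cluster, parallel work is launched through the scheduler, not bare background processes:

```shell
# Serial: each task runs only after the previous one finishes
for i in 1 2 3; do
  echo "serial task $i" > "serial_$i.txt"
done

# Parallel: independent tasks are launched at once with '&',
# then 'wait' blocks until all background tasks have finished
for i in 1 2 3; do
  ( echo "parallel task $i" > "parallel_$i.txt" ) &
done
wait

ls serial_*.txt parallel_*.txt   # all six output files now exist
rm serial_*.txt parallel_*.txt   # clean up
```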

Overview of UCSB resources

  • Center for Scientific Computing
    • pod cluster (2018), knot cluster (2011), braid (condo clusters)
  • Letters and Science Information Technology
    • Aristotle cloud cluster
      • ideal for teaching, e.g., Jupyter notebooks
      • persistent internet connection for collecting data
  • Extreme Science and Engineering Discovery Environment (XSEDE)
  • West coast-wide consumer-grade GPU cluster (machine learning)
    • Nautilus cluster
  • Triton Shared Computing Cluster (TSCC) at San Diego Supercomputing Center (SDSC)

Overview of UCSB resources


  • Campus available cluster Knot (CentOS/RH 6):
    • 110 nodes, ~1400 cores
    • 4 ‘fat nodes’ (1 TB RAM)
    • GPU nodes (12 M2050s) (now too old)

  • Campus available cluster Pod (CentOS/RH 7):
    • 70 nodes, ~2600 cores
    • 4 ‘fat nodes’ (1 TB RAM)
    • GPU nodes (3) (quad NVIDIA V100/32 GB with NVLink)
    • GPU development node (P100, 1080 Ti, Titan V)

  • Condo clusters (PIs buy compute nodes):
    • Guild (60 nodes)
    • Braid (120 nodes, also has GPUs)

Accessing UCSB Resources

Campus Champion (Sharon Solis): represents XSEDE on the UCSB campus

Using the UCSB Clusters - Pod and Knot


Using the UCSB Clusters - Pod and Knot


  • login node versus compute nodes - don't run computations on the login node!
  • home directories are limited to roughly 4-10 TB - remove your data when you're done!
  • command line interface

Some basic commands

  • pwd
  • cd
  • ls
  • mkdir
  • mv
  • rm
  • nano (a simple text editor)
  • .bash_history (the up arrow recalls previous commands; stores 1000 lines)
  • search your command history, e.g.: more .bash_history | grep qsub
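A quick tour of those commands, safe to run in any scratch directory (the directory and file names here are just examples):

```shell
mkdir -p hpc_demo          # make a new directory
cd hpc_demo
pwd                        # print the current (working) directory
echo "hello" > notes.txt   # create a small file
ls                         # list files: notes.txt
mv notes.txt results.txt   # rename (move) the file
rm results.txt             # delete the file
cd .. && rmdir hpc_demo    # clean up
```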

Running jobs

  • scheduler
    • decides when and where jobs run, using a fair share model
  • queues
    • check the queue with showq (knot) or squeue (pod)

Some specifics for R

  • versions: several versions of R are installed as modules; load one with module load R
  • libraries: install packages into a personal library in your home directory (you can't write to the system-wide library on a shared cluster)
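A minimal setup sketch for both points. The module lines are commented out so this also runs on a machine without environment modules, and the package name and library path are examples, not cluster requirements:

```shell
# Load the cluster's R module before running R (module names vary;
# run `module avail R` on pod/knot to see which versions are installed)
# module load R

# Point R at a personal package library in your home directory, since
# the system-wide library on a shared cluster is not writable by users
export R_LIBS_USER="$HOME/R/library"
mkdir -p "$R_LIBS_USER"

# Then install packages as usual from inside R, e.g.:
# Rscript -e 'install.packages("dplyr", repos = "https://cloud.r-project.org")'

echo "personal library: $R_LIBS_USER"
```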

An example

  1. login to cluster
  2. transfer input files
  3. create a submission script
  4. submit your job and check the status
  5. scheduler runs computation on compute nodes
  6. examine and transfer output files

Give it a try

  1. login to cluster
ssh username@pod.cnsi.ucsb.edu

Give it a try

  2. transfer input files
scp file.txt user@pod.cnsi.ucsb.edu:file_copy.txt

Let’s also write a quick R script to run:

cat > myscript.R <<'EOF'
data <- data.frame(x = 1:10, y = 1:10)
write.csv(data, "testcsv.csv", row.names = FALSE)
EOF

Give it a try

  3. create a submission script
nano submit.job
#!/bin/bash -l
#Serial (1 core on one node) job...
#SBATCH --nodes=1 --ntasks-per-node=1

cd $SLURM_SUBMIT_DIR
module load R
Rscript myscript.R
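If your R code can use multiple cores (for example via the built-in parallel package), request them in the submission script. A sketch of a one-node parallel job; the core count and the script name myparallel.R are placeholders, not cluster defaults:

```shell
#!/bin/bash -l
# Parallel (12 cores on one node) job -- adjust the core count to your code
#SBATCH --nodes=1 --ntasks-per-node=12

cd $SLURM_SUBMIT_DIR
module load R
# myparallel.R would use e.g. parallel::mclapply(..., mc.cores = 12)
Rscript myparallel.R
```

Keep the number of cores your R code actually uses (mc.cores) matched to what you requested from the scheduler, or you will either waste your allocation or oversubscribe the node.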

Give it a try

  4. submit your job and check the status
sbatch submit.job
squeue -u $USER

Give it a try

  5. wait…


Give it a try

  6. transfer output files (run from your own computer to pull results back)
scp user@pod.cnsi.ucsb.edu:testcsv.csv .

Some tips

  • Remote login from off campus: connect through the UCSB VPN
  • Try a small version of your code on your computer first to make sure it runs from beginning to end.
  • Be explicit about location of input and output files
  • Please include in your papers! “We acknowledge support from the Center for Scientific Computing from the CNSI, MRL: an NSF MRSEC (DMR-1720256) and NSF CNS-1725797.”